This paper presents a critique of teachers’ assessment practices from a social justice 
standpoint. It is based on two studies of different aspects of the established system of 
teacher assessment in the UK. Each study found that teachers’ own perspectives, 
resources and interpretations led to the construction of views of students’ mathematics 
which differed from those constructed by other teachers or researchers. The authors 
conclude that professional critique of assessment decisions made about individuals 
needs to be developed, and raise further research questions about equity in assessment 
practices. 



Recent research and curriculum development related to assessment in mathematics 
education has been associated with changes in conceptions of the nature of 
mathematical learning. There have been some powerful critiques of the poverty of the 
information provided by some traditional methods of assessment and of the inequity 
inherent in them (e.g., Burton, 1994). Assessment methods associated with “reform” 
curricula have also begun to be critiqued from an equity perspective (e.g., Cooper & 
Dunne, 1998). In rejecting crude positivism, however, most reformers and researchers 
have still made the assumption that, with “improved” methods, it is possible to achieve 
valid insight into students’ thinking. We wish to problematise this assumption. 



In a number of countries, the wish to broaden the repertoire of strategies for assessing 
students in mathematics has led to an increase in the responsibility devolved to teachers 
for assessing their own students. Among the potential problems, Clarke (1996) 
identifies teachers’ lack of expertise in “devising, employing and interpreting 
assessment tasks” (p. 333). This formulation assumes that it is possible to have 
“expertise” in interpreting students’ responses and that teachers can be supported in 
gaining this. We shall argue, both theoretically and through our empirical investigations 
of the practices of teachers engaged in such interpretation, that this is not a simple issue. 
The two independently conceived but complementary research programmes that we 
shall describe (Morgan, 1998; Watson, 1997; 1998) both address concerns with the 
nature of teachers’ “expertise” as assessors. 




Assessment is contextualised and interpretative 

Every act of assessment takes place within a specific context, consisting not only of the 
current circumstances but also of the resources brought to bear by the assessor 
(including personal knowledge of mathematics and of the curriculum, experience and 
expectations of a particular child and of children in general, beliefs about assessment, 
experience of the classroom, etc.). These “reader resources” (Fairclough, 1989) arise 
both from the assessor’s personal, social and cultural history and from their current 
positioning within a particular discourse. The professional enculturation of teachers 
seems likely to ensure a certain degree of common resource, as may be seen in the 
success of training programmes and joint moderation meetings in achieving consensus 
about ranks, levels and grades in relation to individual pieces of work and portfolios 
(see, for example, Wiliam, 1994). However, teacher-assessors will also make use of 
more individual resources, including their past and present relationships with individual 
students or with groups of students and the extent of their authority and autonomy as 
professionals. A range of positions are available to teachers in relation to their students, 
other teachers and external authorities (Morgan, 1994). Different positionings are likely 
to give rise to the use of different sets of resources and hence to different actions and 
judgements by different teachers or by a single teacher at different times in different 
circumstances (cf. Evans & Tsatsaroni, 1998; Walkerdine, 1988). 

The two studies of teacher-assessment that we discuss below are located within 
interpretative paradigms. In both studies attention was focused on possible ways of 
interpreting texts, where “text” is taken to encompass physical actions, facial or body 
language as well as spoken or written language. The teacher “reads” any text produced 
by a student in an interpretative and contextualised way, constructing meaning in terms 
of the student’s mathematical attainment. A new orthodoxy of mathematical activity 
and interactive classroom practice, rooted in various versions of constructivism, is fairly 
well-established in mathematics education, based on recognition of the interpretative 
and contingent nature of students’ construction of knowledge. Less consideration has 
been given to how teachers base pedagogic decisions on their constructions of 
individual students’ mathematics. Though Simon (1995), for example, writes that “the 
teacher can compare his understanding of a particular concept to his construction of the 
students’ understandings, not to the students’ ‘actual’ understandings” (p. 135), the 
cumulative effect of such interpretations is not discussed, and any doubts about the 
validity of summative statements of student achievement tend to be expressed in terms 
of the quality of the sampling of student behaviour (due to situatedness, temporal 
specificity, assessment style etc.) rather than in terms of its interpretation. 

The impressions and expectations of their students that teachers develop through their 
readings of the evidence available to them are incorporated into the curriculum and into 
future interactions with the student. Hence expectations tend, unsurprisingly, to be 
fulfilled (Nash, 1976). Early impressions are therefore crucially important and may have 
such a strong effect that subsequent events are noticed and interpreted only insofar as 
they support the original impression (Nisbett & Ross, 1980). Moreover, teachers’ 
expectations about students’ mathematical learning may not be based on mathematical 
evidence, but on the evidence of other behaviour, social skills, and social class 
background (McIntyre et al., 1966; Walkerdine, 1984). What appears salient to one 
teacher may not to another; this is a particular issue within mathematics with its wide 
range of possible modes of communication. 

The research studies 

The two studies we will discuss originated within the assessment system in UK state 
schools, where teachers’ assessments of students’ mathematics have been a statutory 
part of national assessment procedures for over ten years. All teachers have been trained 
to incorporate assessment into their teaching, and to use assessment criteria. While the 
particular national context necessarily affects the detail of the teacher-assessor practices 
that we will describe, our analyses illuminate broader issues which have to be seen as 
features of an established system. 

Study A: Teachers’ constructions of views of students’ mathematics 

The initial aim of study A was to develop a description of ways in which teachers 
interpret and accumulate their experience of students in mathematics classrooms, not 
only during formal assessment situations but also during normal day-to-day activity 
(Watson, 1995; 1999). Interviews with teachers revealed a number of problematic 
aspects of practice. When asked about individual students, teachers talked more about the 
students’ learning behaviour in general than about specific achievements. For example, it was common 
to say that students who were regarded as strong mathematicians were “well-organised 
and self-motivated” or “quick” rather than to describe particular mathematical features 
of their work. Although teachers thought it very important to take into account students’ 
oral work, they tended to do this unsystematically. A combination of records of written 
work and personal recollection formed the basis for most teachers’ assessments. 
Teachers spoke of “getting to know” students as a frame for interpreting their work, 
rather than as an outcome of informal assessment. Summative assessment was done 
partly by personal recollection - some teachers, knowing the flaws of testing, regarded 
this as much fairer than testing. 

These issues were explored further in a study of two teachers “getting to know” new 
classes (Watson, 1997). The researcher observed ten students in one lesson a week with 
each teacher during the first term with the new class. All public verbal utterances and 
some one-to-one interactions by the target students were noted; all written work 
produced by each student during the term was scrutinised; and behaviour and actions were 
observed and noted using systematic observation punctuated with 
records of complete incidents, as interpreted by the researcher. Considerably more 
observed information about behaviour and actions of the target students was available to 
the researcher than to the teacher, though they had similar oral data and the same 
written data; the researcher also had more time to consider the data. Both the researcher 
and the teacher formed views about the target students’ mathematics and discussed 
them regularly with each other, but the differences in the data to which they had access 
contributed to substantially different interpretations. The following example illustrates 
typical differences. 

Sandra (aged 11) took part enthusiastically in the routine start of each lesson when the 
class marked their own answers to mental arithmetic questions done at home. She 
sometimes called out answers and often put her hand up energetically, waving it around 
to attract attention. Nearly all her enthusiastic contributions to class arose from work 
done at home or, very occasionally, from discussions the teacher had with her. When 
her answers were wrong Sandra made comments like “but my father said ....” It was as 
if she knew that to get approval from the teacher you had to show right answers 
publicly, and she was skilled in getting those right answers not from her own head but 
from her father or from the teacher. Other evidence to support an interpretation that 
Sandra was weak in number work was her use of fingers, particularly for subtraction, in 
situations where a competent arithmetician would have known number bonds or have 
developed some patterning strategies. 

The teacher, while being aware that Sandra sometimes altered answers as she marked 
them, was not aware of the extent of the alterations and had a view that Sandra was 
mainly good at mental arithmetic, changing answers in order to boost her confidence. 
The teacher’s estimation of her arithmetic, initially low as a result of a test, had risen to 
a relatively high level as a result of her oral contributions to class. The researcher’s 
estimation was that she lacked skills, lacked useful internalisation of arithmetical facts, 
and had previously relied on performance of algorithms which she failed to remember. 
It was as if the teacher had a neutral view of her until something outstanding happened, 
in this case her enthusiastic contribution to homework feedback sessions. Once there 
had been an event which allowed him to differentiate her mathematics from that of 
other students, the view he formed remained strongly with him so that subsequent 
events were interpreted in that light. Hence her alteration of answers became 
under-represented in his picture of her and the pattern of her responses was not obvious to 
him. 

In contrast, the teacher felt that Sandra was relatively weak in her ability to think 
mathematically while using and applying mathematics or when tackling new ideas. 
However, several instances of mathematical thinking were inferred by the researcher 
from Sandra’s verbal comments and from her actions. In these instances she appeared 
able to devise strategies, adapt strategies which had been effective in the past, describe 
patterns and make conjectures resulting from patterns. 

Relative to the researcher, therefore, the teacher appeared to overestimate skills in the 
area of mathematics in which Sandra wanted him to be interested, and to underestimate 
her skills of reasoning. The teacher, seeing her work always in the context of what the 
rest of the class did and what his own expectations were, makes a judgement which is 
comparative to what she has done before and to the rest of the class. But “what she has 
done before” includes creating an impression in his mind; therefore judgements are 
relative to the picture already formed. Of course, the researcher was able to see a pattern 
to Sandra’s oral contributions that was not visible to the teacher. It was the pattern and 
circumstances of the contributions, rather than their quantity or nature, which indicated 
the contrast between her arithmetical abilities and her desire to be good with 
calculations. The teacher has seen this analysis and accepts it as a description of aspects 
of Sandra’s work and the problems of informal judgement of which he was not, and 
could not be, aware. The competing demands on the teacher's attention in a classroom 
make detailed observation impossible. 

Analysis of data about the ten students in this study led to the identification of a number 
of issues (see Watson (1997) for a fuller discussion): 



• teachers may see only part of the whole story; 

• teachers may see, or fail to see, patterns in responses and behaviour; 

• some behaviour may be over- or under-represented in the teacher’s mental picture; 

• teachers may be strongly influenced by students’ strong or weak social skills; 

• teachers interpret work in the light of existing impressions; 

• time constraints on the teacher prevent full exploration of mathematics; 

• perceptions of external purposes affect assessment; 

• teachers are unable to see and use all the details which occur in classrooms. 

In no sense are we suggesting that there is a true view to be achieved, nor that the 
researcher is correct and the teacher wrong. On the contrary, the study suggests that 
teachers’ informal assessments, made in classroom contexts, are inevitably influenced 
by a variety of unavoidable factors which may have little to do with mathematical 
achievement. In the UK system, these informal assessments contribute to formal, 
summative, high-stakes assessments as the teacher brings her perceptions of the student 
to bear on subsequent interpretations of mathematical performance. It is of concern that 
this takes place within a system in which teachers have been trained to use detailed, 
tested criteria, and where assessments contribute to high-stakes decisions. 

Study B: Teacher assessment of written mathematics in a high-stakes context 

The second study was set in the context of the high-stakes GCSE examination for 
students aged 16+ in England and Wales. The coursework component of this 
examination, which is assessed by teachers, most commonly takes the form of reports of 
one or more extended investigative tasks. These reports are intended to include evidence 
of the mathematical processes that students have gone through (for example: 
systematising, observing, conjecturing, generalising, justifying) as well as the results of 
their investigation. The original concern of this study was to investigate the forms of 
writing that students produced in their coursework and to consider the match or 
mismatch between student writing and the forms of mathematical writing valued by 
teacher assessors (Morgan, 1998). Analysis of interviews with 11 experienced teachers 
reading and evaluating students’ coursework texts explored the teachers’ assessment 
practices, the features of the texts that the teachers attended to, and the values that came 
into play as they formed judgements about the texts and about the student-writers. The 
issue emerging from the results of these analyses that we wish to consider here is the 
diversity that was discovered, both in the ways different teachers approached the task of 
reading and assessing student texts and in the meanings and evaluations they 
constructed from the same texts (Morgan, 1996; 1998). 

All the teachers had been trained in the use of common sets of criteria, were 
experienced in applying these criteria to their own students’ work, and had participated 
in moderation processes. Nevertheless, their relationships to the criteria and their 
methods of approaching the task of applying them were in some cases very different. 
The following example illustrates the ways in which teachers reading with different 
resources can arrive at very different judgements about the same student. 
One student, Steven, had written a report on his work on a task called ‘Topples’ that 
involved investigating piles built up of rods of increasing lengths, seeking a relationship 
between the length of the rod at the bottom of the pile and the length of the rod that 
would first make the pile topple over. Steven found a formula expressing this 
relationship, $(A + A) + \frac{A}{2}$, and showed that he could use it to find the length of 
the ‘topple rod’ for piles starting with rods longer than those he had available to build 
with. He then presented an alternative method for finding results for piles starting with 
very long rods by scaling up his results for piles starting with short rods: 

An alternative way to do this would be to take the result of a pile starting at 
10 and multiply it by 10 

$(10 + 10) = 20$ 

$\frac{10}{2} = 5$ 

$20 + 5 = 25$ 

$25 \times 10 = 250$ 



No derivation or further justification of this method was provided in the text. In order to 
evaluate Steven’s work, each teacher-reader constructed his or her own understanding, 
not only of the method itself but also of the means by which Steven might have derived 
it and of Steven’s level of mathematical achievement. The following extracts from 
interviews with two teachers reading this section of Steven’s work illustrate how 
different these understandings can be. 

Teacher 1: Charles Um ok so I mean he’s found the rule and he’s quite 
successfully used it from what I can see to make predictions about what’s 
going to happen for things that he obviously can’t set up. So that shows that 
he understands the formula which he’s come up with quite well, I think. 

There’s also found some sort of linearity in the results whereby he can just 
multiply up numbers which again shows quite a good understanding of the 
problem I think. 

Charles recognises the mathematical validity of the alternative method, relating it to the 
linearity of the relationship between the variables. He takes this as a sign that the 
student has “come up with” the formula as a result of understanding the linearity of the 
situation. This results in a positive evaluation of Steven’s mathematical understanding. 
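
The validity of the alternative method can be checked directly. As a brief sketch, write $T(A)$ for the topple-rod length that Steven’s formula gives for a pile starting with a rod of length $A$ (the notation $T$ is introduced here for illustration and does not appear in Steven’s report):

\[
T(A) = (A + A) + \frac{A}{2} = \frac{5A}{2}, \qquad T(kA) = \frac{5kA}{2} = k\,T(A).
\]

Because $T$ is proportional to $A$, scaling the starting rod by any factor $k$ scales the result by the same factor: $T(100) = 10 \times T(10) = 10 \times 25 = 250$, which agrees with Steven’s calculation.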

Teacher 2: Grant It’s interesting that the next part works, I don’t know if it 
works for everything or it just works for this but he’s spotted it and again he 
hasn’t really looked into it any further. He’s done it for one case but 
whether it would work for any other case is er I don’t know, he hasn’t 
looked into it [...] And it’s just a knowledge of number that’s got him there I 
think intuition whatever. He may have guessed at a few and found one that 
works for it. 
Grant appears less confident with the mathematical validity of the alternative formula, 
expressing some uncertainty about whether the method would work in general. Perhaps 
because of this uncertainty, his narrative explaining how Steven might have arrived at 
the method devalues the student’s achievement, suggesting that the processes involved 
were not really mathematical: “spotting” the method, not looking into it properly, 
guessing, using “just a knowledge of number” or “intuition”. 

In this case, the teachers’ different interpretations of Steven’s level of understanding 
and different hypotheses about the methods he might have used to achieve his results 
seem to be connected to their personal mathematical resources. It is Charles, expressing 
a clear understanding of the relationship of the alternative method to the linearity of the 
situation, who makes the most positive evaluation of Steven’s understanding, while 
Grant, apparently uncertain of the general validity of the method, constructs a picture 
of the student working in relatively unstructured or experimental ways. Other 
differences in interpretation and evaluation of Steven’s results and his means of 
achieving them were also evident in other teachers’ readings of his text. 

In order to make sense of a text and to use it to evaluate the student-writer’s 
achievement, each teacher must compose an explanatory narrative, drawing on the 
resources available to them. These resources include common expectations of the 
general nature of investigation and investigative reports. Where students had produced 
‘standard’ work and expressed it using the conventions of the investigation report genre 
it was found that teachers’ evaluations coincided closely. However, if a student text 
diverges from the ‘usual’ to the extent that it is not covered by the established common 
expectations, each teacher must resort to their more personal resources, thus creating the 
possibility of divergence in the narratives they compose. Major differences between 
teachers in their interpretations and evaluations of students’ texts occurred primarily 
where the student’s text diverged from the ‘norm’ in some way — either in its 
mathematical content or in the form in which it was expressed. 

Other ways in which teachers’ interpretations and approaches to evaluation were found 
to differ included: 

• different hypotheses about work that the student might have done in the classroom 
that was not indicated in the written text - and different attitudes towards valuing 
such unrecorded achievement; 

• different judgements about factual aspects of the text, for example, whether the 
wording of the problem given by the student had been copied from the original 
source or had been paraphrased in the student’s own words; 

• different approaches to the task of assessing a piece of student work: some teachers 
appeared to focus on building up an overall picture of the characteristics of the 
student, some were interested in making mathematical sense of what the student had 
done, while others focussed solely on finding evidence in the text to meet specific 
criteria; 



• different ways of resolving tensions between the teachers’ own value systems and 
their perceptions of the demands of externally imposed sets of assessment criteria. 

Conclusions 

The two studies illustrate the variations that are possible in judgements teachers make 
about students’ mathematical achievements and suggest some of the sources of these 
variations. The first study suggests that teachers form views of students' mathematical 
strengths and weaknesses based on information which is inevitably partial, due to the 
impossibility of seeing and hearing everything and the need to stress some aspects of a 
student's performance and to decide which others are unimportant. The second study 
shows that judgements about written work in mathematics are intimately connected with 
the values and experiences that teachers bring to their interpretations of student text. It 
seems likely that such values and previous experiences are also significant in 
influencing the aspects of student behaviour that teachers notice and attach importance 
to in the classroom. The evidence teachers are able to attend to in the classroom is 
inevitably partial and will vary between different observers. Yet even when exactly the same 
evidence is available to all observers (as it is in the case of written texts), the two studies 
show that different teachers can interpret the same or similar student texts of all kinds in 
very different ways, attending to different salient features and placing different values 
on similar features. These differences can occur both in informal classroom assessment 
and in formal high-stakes situations. 

In both studies the teachers were experienced and had been trained in assessment 
methods. They were also aware that they were involved in research about assessment. 
They may thus be assumed to have been making judgements in the most professional 
ways they knew. Differences are unlikely, therefore, to be attributable to a lack of 
skill or a lack of professionalism. In both situations the judgements could influence 
what happens next in the student's mathematical career. Judgements made in the 
everyday classroom influence the teacher’s future interactions with the student and the 
mathematical teaching provided, while judgements in more formally summative 
contexts have obvious ‘high-stakes’ consequences for progression to further education 
and employment opportunities. Yet, in both situations the positions and priorities, 
values and experience of the teacher influence the judgement. 

While it may be possible to resolve such differences in summative assessments where 
there is time available and teachers can meet and discuss decisions (Clarke, 1996; 
NCTM, 1995; SEAC, 1991), we would argue that, because of the interpretative nature 
of any act of assessment, it is not possible to avoid differences altogether. This raises 
important issues for equity in education both in summative and formative situations. In 
particular, those students who do not share the social, cultural and linguistic background 
of the teacher-assessor are less likely to have access to the resources that would enable 
them to produce texts that will have the features that the teacher will attend to and value 
highly. Moreover, the consequences of even informal teacher judgements are not merely 
fleeting but have lasting influence on the educational opportunities available to students. 



We are not suggesting that teachers should not make judgements; people have to make 
subjective judgements about others in order to communicate. Moreover, we are not 
arguing that teachers are “bad” at assessing and need to be trained in “correct” methods. 
Rather we would argue for further professionalisation of the ways teachers read and 
evaluate students’ mathematical texts, and the ways they make use of their informal 
evaluations. This might involve the development of self-critical approaches to making 
assessments about others, including awareness of the relative, partial, and interpretative 
nature of assessments, of the prejudices which may influence them and of the ways in 
which assessment can deny students access to equal opportunities. A critical 
professional environment in which colleagues are expected to question and justify 
decisions could support such development. 

Further questions 

In considering the implications for future research related to teacher assessment in 
mathematics education, we are concerned both with the quality and nature of the 
judgements made by teachers and with the issues of equity raised by questions about 
students’ access to the means of expression that are likely to lead to high evaluation of 
their mathematical competence. Questions that need to be addressed include: 

• What are the characteristics of student behaviour that will lead to a student being 
seen as a competent mathematician, and that will lead to high evaluations? 

• To what extent are students belonging to various social and cultural groups aware of 
their teachers’ values and expectations and able to demonstrate them? 

• To what extent are teachers, teacher educators and assessment designers aware of 
the ways in which assessments may be influenced by various teachers’ values and 
expectations? 

• How can teachers raise students’ awareness of their assessors’ values and 
expectations and enable students to behave in (mathematical, linguistic and social) 
ways that will lead to high evaluations? 

• Do the differences between and within teachers’ judgements act structurally to 
disadvantage certain social groups? How do any such inequities relate to those 
known to be inherent in formal written tests? 